{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "fO0xC1Cfp5-U"
   },
   "source": [
    "# Ungraded Lab: Using a Simple RNN for forecasting\n",
    "\n",
    "In this lab, you will start to use recurrent neural networks (RNNs) to build a forecasting model. In particular, you will:\n",
    "\n",
    "* build a stacked RNN using `simpleRNN` layers\n",
    "* use `Lambda` layers to reshape the input and scale the output\n",
    "* use the Huber loss during training\n",
    "* use batched data windows to generate model predictions\n",
    "\n",
    "You will train this on the same synthetic dataset from last week so the initial steps will be the same. Let's begin!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "w5kDhl0kp4ML"
   },
   "source": [
    "## Imports"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "BOjujz601HcS"
   },
   "outputs": [],
   "source": [
    "import tensorflow as tf\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "ppd0VK_4p1c8"
   },
   "source": [
    "## Utilities"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "Zswl7jRtGzkk"
   },
   "outputs": [],
   "source": [
    "def plot_series(time, series, format=\"-\", start=0, end=None):\n",
    "    \"\"\"\n",
    "    Visualizes time series data\n",
    "\n",
    "    Args:\n",
    "      time (array of int) - contains the time steps\n",
    "      series (array of float) - contains the measurements for each time step\n",
    "      format - line style when plotting the graph\n",
    "      start - first time step to plot\n",
    "      end - last time step to plot\n",
    "    \"\"\"\n",
    "\n",
    "    # Setup dimensions of the graph figure\n",
    "    plt.figure(figsize=(10, 6))\n",
    "    \n",
    "    if type(series) is tuple:\n",
    "\n",
    "      for series_num in series:\n",
    "        # Plot the time series data\n",
    "        plt.plot(time[start:end], series_num[start:end], format)\n",
    "\n",
    "    else:\n",
    "      # Plot the time series data\n",
    "      plt.plot(time[start:end], series[start:end], format)\n",
    "\n",
    "    # Label the x-axis\n",
    "    plt.xlabel(\"Time\")\n",
    "\n",
    "    # Label the y-axis\n",
    "    plt.ylabel(\"Value\")\n",
    "\n",
    "    # Overlay a grid on the graph\n",
    "    plt.grid(True)\n",
    "\n",
    "    # Draw the graph on screen\n",
    "    plt.show()\n",
    "\n",
    "def trend(time, slope=0):\n",
    "    \"\"\"\n",
    "    Generates synthetic data that follows a straight line given a slope value.\n",
    "\n",
    "    Args:\n",
    "      time (array of int) - contains the time steps\n",
    "      slope (float) - determines the direction and steepness of the line\n",
    "\n",
    "    Returns:\n",
    "      series (array of float) - measurements that follow a straight line\n",
    "    \"\"\"\n",
    "\n",
    "    # Compute the linear series given the slope\n",
    "    series = slope * time\n",
    "\n",
    "    return series\n",
    "\n",
    "def seasonal_pattern(season_time):\n",
    "    \"\"\"\n",
    "    Just an arbitrary pattern; you can change it if you wish\n",
    "    \n",
    "    Args:\n",
    "      season_time (array of float) - contains the measurements per time step\n",
    "\n",
    "    Returns:\n",
    "      data_pattern (array of float) -  contains revised measurement values according \n",
    "                                  to the defined pattern\n",
    "    \"\"\"\n",
    "\n",
    "    # Generate the values using an arbitrary pattern\n",
    "    data_pattern = np.where(season_time < 0.4,\n",
    "                    np.cos(season_time * 2 * np.pi),\n",
    "                    1 / np.exp(3 * season_time))\n",
    "    \n",
    "    return data_pattern\n",
    "\n",
    "def seasonality(time, period, amplitude=1, phase=0):\n",
    "    \"\"\"\n",
    "    Repeats the same pattern at each period\n",
    "\n",
    "    Args:\n",
    "      time (array of int) - contains the time steps\n",
    "      period (int) - number of time steps before the pattern repeats\n",
    "      amplitude (int) - peak measured value in a period\n",
    "      phase (int) - number of time steps to shift the measured values\n",
    "\n",
    "    Returns:\n",
    "      data_pattern (array of float) - seasonal data scaled by the defined amplitude\n",
    "    \"\"\"\n",
    "    \n",
    "    # Define the measured values per period\n",
    "    season_time = ((time + phase) % period) / period\n",
    "\n",
    "    # Generates the seasonal data scaled by the defined amplitude\n",
    "    data_pattern = amplitude * seasonal_pattern(season_time)\n",
    "\n",
    "    return data_pattern\n",
    "\n",
    "def noise(time, noise_level=1, seed=None):\n",
    "    \"\"\"Generates a normally distributed noisy signal\n",
    "\n",
    "    Args:\n",
    "      time (array of int) - contains the time steps\n",
    "      noise_level (float) - scaling factor for the generated signal\n",
    "      seed (int) - number generator seed for repeatability\n",
    "\n",
    "    Returns:\n",
    "      noise (array of float) - the noisy signal\n",
    "    \"\"\"\n",
    "\n",
    "    # Initialize the random number generator\n",
    "    rnd = np.random.RandomState(seed)\n",
    "\n",
    "    # Generate a random number for each time step and scale by the noise level\n",
    "    noise = rnd.randn(len(time)) * noise_level\n",
    "    \n",
    "    return noise"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "B3ZFrzpCpyTo"
   },
   "source": [
    "## Generate the Synthetic Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "-UJCR61zleGM"
   },
   "outputs": [],
   "source": [
    "# Parameters\n",
    "time = np.arange(4 * 365 + 1, dtype=\"float32\")\n",
    "baseline = 10\n",
    "amplitude = 40\n",
    "slope = 0.05\n",
    "noise_level = 5\n",
    "\n",
    "# Create the series\n",
    "series = baseline + trend(time, slope) + seasonality(time, period=365, amplitude=amplitude)\n",
    "\n",
    "# Update with noise\n",
    "series += noise(time, noise_level, seed=42)\n",
    "\n",
    "# Plot the results\n",
    "plot_series(time, series)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "RbW2GxrgpuZF"
   },
   "source": [
    "## Split the Dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "jhkuHzlol2VW"
   },
   "outputs": [],
   "source": [
    "# Define the split time\n",
    "split_time = 1000\n",
    "\n",
    "# Get the train set \n",
    "time_train = time[:split_time]\n",
    "x_train = series[:split_time]\n",
    "\n",
    "# Get the validation set\n",
    "time_valid = time[split_time:]\n",
    "x_valid = series[split_time:]"
   ]
  },
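  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a quick sanity check, you can print the sizes of the two splits. Since `split_time` is set to `1000` and the series has `4 * 365 + 1 = 1461` points, you should see 1000 training points and 461 validation points."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Sanity check: the two splits should cover the entire series\n",
    "print(f'train set size: {len(x_train)}')\n",
    "print(f'validation set size: {len(x_valid)}')"
   ]
  },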
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "TP6a2ckVqThP"
   },
   "source": [
    "## Prepare Features and Labels"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "SawHhwx3qOMC"
   },
   "outputs": [],
   "source": [
    "# Parameters\n",
    "window_size = 20\n",
    "batch_size = 32\n",
    "shuffle_buffer_size = 1000"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You will be using `SimpleRNN` layers later and as mentioned in its [documentation](https://www.tensorflow.org/api_docs/python/tf/keras/layers/SimpleRNN#call_arguments), these expect a 3-dimensional tensor input with the shape `[batch, timesteps, feature`]. With that, you need to reshape your window from `(32, 20)` to `(32, 20, 1)`. This means the 20 data points in the window will be mapped to 20 timesteps of the RNN. To implement this, you will add an `expand_dims()` to the `windowed_dataset()` function you used in the previous labs. \n",
    "\n",
    "_Note: Technically, you will only need this extra line if you don't specify the input shape as you will do later when you build the model. Nonetheless, it is best practice to define transformations like this, especially in data pipelines. It can help make debugging easier in case you have problems later on._"
   ]
  },
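  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To see what this step does before plugging it into the pipeline, here is a minimal sketch using a dummy tensor shaped like one batch of windows. `expand_dims()` simply appends a size-1 feature axis, turning `(32, 20)` into `(32, 20, 1)`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Minimal sketch: expand_dims() appends a size-1 feature axis\n",
    "dummy_batch = tf.zeros((32, 20))\n",
    "print(f'shape before: {dummy_batch.shape}')\n",
    "print(f'shape after: {tf.expand_dims(dummy_batch, axis=-1).shape}')"
   ]
  },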
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "4sTTIOCbyShY"
   },
   "outputs": [],
   "source": [
    "def windowed_dataset(series, window_size, batch_size, shuffle_buffer):\n",
    "    \"\"\"Generates dataset windows\n",
    "\n",
    "    Args:\n",
    "      series (array of float) - contains the values of the time series\n",
    "      window_size (int) - the number of time steps to include in the feature\n",
    "      batch_size (int) - the batch size\n",
    "      shuffle_buffer(int) - buffer size to use for the shuffle method\n",
    "\n",
    "    Returns:\n",
    "      dataset (TF Dataset) - TF Dataset containing time windows\n",
    "    \"\"\"\n",
    "\n",
    "    # Add an axis for the feature dimension of RNN layers\n",
    "    series = tf.expand_dims(series, axis=-1)\n",
    "\n",
    "    # Generate a TF Dataset from the series values\n",
    "    dataset = tf.data.Dataset.from_tensor_slices(series)\n",
    "    \n",
    "    # Window the data but only take those with the specified size\n",
    "    dataset = dataset.window(window_size + 1, shift=1, drop_remainder=True)\n",
    "    \n",
    "    # Flatten the windows by putting their elements in a single batch\n",
    "    dataset = dataset.flat_map(lambda window: window.batch(window_size + 1))\n",
    "\n",
    "    # Create tuples with features and labels \n",
    "    dataset = dataset.map(lambda window: (window[:-1], window[-1]))\n",
    "\n",
    "    # Shuffle the windows\n",
    "    dataset = dataset.shuffle(shuffle_buffer)\n",
    "    \n",
    "    # Create batches of windows\n",
    "    dataset = dataset.batch(batch_size)\n",
    "    \n",
    "    # Optimize the dataset for training\n",
    "    dataset = dataset.cache().prefetch(1)\n",
    "    \n",
    "    return dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "hvmPMWqhqkUh"
   },
   "outputs": [],
   "source": [
    "# Generate the dataset windows\n",
    "dataset = windowed_dataset(x_train, window_size, batch_size, shuffle_buffer_size)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "HDUsp8n99Zhd"
   },
   "outputs": [],
   "source": [
    "# Print shapes of feature and label\n",
    "for window in dataset.take(1):\n",
    "  print(f'shape of feature: {window[0].shape}')\n",
    "  print(f'shape of label: {window[1].shape}')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "hvzrTbBLHTx2"
   },
   "source": [
    "## Build the Model\n",
    "\n",
    "Your model is composed mainly of [SimpleRNN](https://www.tensorflow.org/api_docs/python/tf/keras/layers/SimpleRNN) layers. As mentioned in the lectures, this type of RNN simply routes its output back to the input. You will stack two of these layers in your model so the first one should                    have `return_sequences` set to `True`. \n",
    "\n",
    "Normally, you can just have a `Dense` layer output as shown in the previous labs. However, you can help the training by scaling up the output to around the same figures as your labels. This will depend on the [activation functions](https://en.wikipedia.org/wiki/Activation_function#Table_of_activation_functions) you used in your model. `SimpleRNN` uses *tanh* by default and that has an output range of `[-1,1]`. You will use a `Lambda` layer to scale the output by 100 before it adjusts the layer weights. `Lambda` layers can be a useful tool to experiment with simple transformations like this. Feel free to remove this layer later after this lab and see what results you get."
   ]
  },
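  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the effect of that `Lambda` layer concrete, the short sketch below passes a few sample values through *tanh* and then scales the result by 100. Since *tanh* outputs fall within `[-1, 1]`, the scaled outputs fall within `[-100, 100]`, which is much closer to the range of the series values."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Sketch: tanh squashes values into [-1, 1]; scaling by 100 widens that to [-100, 100]\n",
    "sample = tf.constant([-10.0, -1.0, 0.0, 1.0, 10.0])\n",
    "tanh_out = tf.tanh(sample)\n",
    "print(f'tanh output: {tanh_out.numpy()}')\n",
    "print(f'scaled by 100: {(tanh_out * 100.0).numpy()}')"
   ]
  },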
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "L4nblWkqg1NL"
   },
   "outputs": [],
   "source": [
    "# Build the Model\n",
    "model_tune = tf.keras.models.Sequential([\n",
    "    tf.keras.Input(shape=(window_size, 1)),\n",
    "    tf.keras.layers.SimpleRNN(40, return_sequences=True),\n",
    "    tf.keras.layers.SimpleRNN(40),\n",
    "    tf.keras.layers.Dense(1),\n",
    "    tf.keras.layers.Lambda(lambda x: x * 100.0)\n",
    "])\n",
    "\n",
    "# Print the model summary\n",
    "model_tune.summary()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "0SI_wtyh4PV1"
   },
   "source": [
    "## Tune the Learning Rate\n",
    "\n",
    "You will then tune the learning rate as before. You will define a learning rate schedule that changes this hyperparameter dynamically. You will use the [Huber Loss](https://en.wikipedia.org/wiki/Huber_loss) as your loss function to minimize sensitivity to outliers."
   ]
  },
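  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To preview what the scheduler will do, the short sketch below evaluates the same formula at a few sample epochs. The learning rate starts at `1e-8` and grows by a factor of 10 every 20 epochs, ending just under `1e-3` at the final epoch."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Preview the learning rate schedule at a few sample epochs\n",
    "for epoch in [0, 20, 40, 60, 80, 99]:\n",
    "  print(f'epoch {epoch:2d}: lr = {1e-8 * 10**(epoch / 20):.2e}')"
   ]
  },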
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "4JuO12ZOq4Gy"
   },
   "outputs": [],
   "source": [
    "# Set the learning rate scheduler\n",
    "lr_schedule = tf.keras.callbacks.LearningRateScheduler(\n",
    "    lambda epoch: 1e-8 * 10**(epoch / 20))\n",
    "\n",
    "# Initialize the optimizer\n",
    "optimizer = tf.keras.optimizers.SGD(momentum=0.9)\n",
    "\n",
    "# Set the training parameters\n",
    "model_tune.compile(loss=tf.keras.losses.Huber(), optimizer=optimizer)\n",
    "\n",
    "# Train the model\n",
    "history = model_tune.fit(dataset, epochs=100, callbacks=[lr_schedule])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "JnwxnPNhdwUI"
   },
   "source": [
    "You can visualize the results and pick an optimal learning rate."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "5He3pp-Hj758"
   },
   "outputs": [],
   "source": [
    "# Define the learning rate array\n",
    "lrs = 1e-8 * (10 ** (np.arange(100) / 20))\n",
    "\n",
    "# Set the figure size\n",
    "plt.figure(figsize=(10, 6))\n",
    "\n",
    "# Set the grid\n",
    "plt.grid(True)\n",
    "\n",
    "# Plot the loss in log scale\n",
    "plt.semilogx(lrs, history.history[\"loss\"])\n",
    "\n",
    "# Increase the tickmarks size\n",
    "plt.tick_params('both', length=10, width=1, which='both')\n",
    "\n",
    "# Set the plot boundaries\n",
    "plt.axis([1e-8, 1e-3, 0, 50])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "lKLJz8umd1wO"
   },
   "source": [
    "You can change the boundaries of the graph if you want to zoom in. The cell below chooses a narrower range so you can see more clearly where the graph becomes unstable."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "Q-XbgGzpuuVF"
   },
   "outputs": [],
   "source": [
    "# Set the figure size\n",
    "plt.figure(figsize=(10, 6))\n",
    "\n",
    "# Set the grid\n",
    "plt.grid(True)\n",
    "\n",
    "# Plot the loss in log scale\n",
    "plt.semilogx(lrs, history.history[\"loss\"])\n",
    "\n",
    "# Increase the tickmarks size\n",
    "plt.tick_params('both', length=10, width=1, which='both')\n",
    "\n",
    "# Set the plot boundaries\n",
    "plt.axis([1e-7, 1e-4, 0, 20])"
   ]
  },
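  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If you prefer a numeric starting point to eyeballing the plots, one common heuristic is sketched below: find the learning rate with the lowest recorded loss, then back off by roughly a factor of 10, since the loss curve tends to become unstable near that minimum. Treat the printed value only as a suggestion to check against the plots."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Heuristic sketch: locate the lr with the lowest loss, then back off by ~10x\n",
    "min_loss_idx = np.argmin(history.history['loss'])\n",
    "print(f'lr at minimum loss: {lrs[min_loss_idx]:.2e}')\n",
    "print(f'suggested starting lr: {lrs[min_loss_idx] / 10:.2e}')"
   ]
  },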
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "PD8-TGbAJUNj"
   },
   "source": [
    "## Train the Model\n",
    "\n",
    "You can then declare the model again and train with the learning rate you picked. It is set to `1e-6`by default but feel free to change it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "6y1KMowRkHkC"
   },
   "outputs": [],
   "source": [
    "# Build the model\n",
    "model = tf.keras.models.Sequential([\n",
    "    tf.keras.Input(shape=(window_size,1)),\n",
    "    tf.keras.layers.SimpleRNN(40, return_sequences=True),\n",
    "    tf.keras.layers.SimpleRNN(40),\n",
    "    tf.keras.layers.Dense(1),\n",
    "    tf.keras.layers.Lambda(lambda x: x * 100.0)\n",
    "])\n",
    "\n",
    "# Set the learning rate\n",
    "learning_rate = 1e-6\n",
    "\n",
    "# Set the optimizer \n",
    "optimizer = tf.keras.optimizers.SGD(learning_rate=learning_rate, momentum=0.9)\n",
    "\n",
    "# Set the training parameters\n",
    "model.compile(loss=tf.keras.losses.Huber(),\n",
    "              optimizer=optimizer,\n",
    "              metrics=[\"mae\"])\n",
    "\n",
    "# Train the model\n",
    "history = model.fit(dataset,epochs=100)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "ewNczbMaJz5Q"
   },
   "source": [
    "## Model Prediction\n",
    "\n",
    "Now it's time to generate the model predictions for the validation set time range. The model is a lot bigger than the ones you used before and the sequential nature of RNNs (i.e. inputs go through a series of time steps as opposed to parallel processing) can make predictions a bit slow. You can observe this when using the code you ran in the previous lab. This will take about a minute to complete."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "ejBynEKekaKw"
   },
   "outputs": [],
   "source": [
    "# Initialize a list\n",
    "forecast = []\n",
    "\n",
    "# Reduce the original series\n",
    "forecast_series = series[split_time - window_size:]\n",
    "\n",
    "# Use the model to predict data points per window size\n",
    "for time in range(len(forecast_series) - window_size):\n",
    "  forecast.append(model.predict(forecast_series[time:time + window_size][np.newaxis], verbose=0))\n",
    "\n",
    "# Convert to a numpy array and drop single dimensional axes\n",
    "results = np.array(forecast).squeeze()\n",
    "\n",
    "# Plot the results\n",
    "plot_series(time_valid, (x_valid, results))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Hn1QZd7LgCcu"
   },
   "source": [
    "You can optimize this step by leveraging Tensorflow models' capability to process batches. Instead of running the for-loop above which processes a single window at a time, you can pass in an entire batch of windows and let the model process that in parallel.\n",
    "\n",
    "The function below does just that. You will notice that it almost mirrors the `windowed_dataset()` function but it does not shuffle the windows. That's because we want the output to be in its proper sequence so we can compare it properly to the validation set."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "ym73y-pDKp-X"
   },
   "outputs": [],
   "source": [
    "def model_forecast(model, series, window_size, batch_size):\n",
    "    \"\"\"Uses an input model to generate predictions on data windows\n",
    "\n",
    "    Args:\n",
    "      model (TF Keras Model) - model that accepts data windows\n",
    "      series (array of float) - contains the values of the time series\n",
    "      window_size (int) - the number of time steps to include in the window\n",
    "      batch_size (int) - the batch size\n",
    "\n",
    "    Returns:\n",
    "      forecast (numpy array) - array containing predictions\n",
    "    \"\"\"\n",
    "\n",
    "    # Add an axis for the feature dimension of RNN layers\n",
    "    series = tf.expand_dims(series, axis=-1)\n",
    "    \n",
    "    # Generate a TF Dataset from the series values\n",
    "    dataset = tf.data.Dataset.from_tensor_slices(series)\n",
    "\n",
    "    # Window the data but only take those with the specified size\n",
    "    dataset = dataset.window(window_size, shift=1, drop_remainder=True)\n",
    "\n",
     "    # Flatten the windows by putting their elements in a single batch\n",
    "    dataset = dataset.flat_map(lambda w: w.batch(window_size))\n",
    "    \n",
    "    # Create batches of windows\n",
    "    dataset = dataset.batch(batch_size).prefetch(1)\n",
    "    \n",
    "    # Get predictions on the entire dataset\n",
    "    forecast = model.predict(dataset, verbose=0)\n",
    "    \n",
    "    return forecast"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "VrRFuUcfieQN"
   },
   "source": [
    "You can run the function below to use the function. Notice that the predictions are generated almost instantly.\n",
    "\n",
    "*Note: You might notice that the first line slices the `series` at `split_time - window_size:-1` which is a bit different from the slower for-loop code. That is because we want the model to have its last prediction to align with the last point of the validation set (i.e. `t=1460`). You were able to do that with the slower for-loop code by specifying the for-loop's `range()`. With the more efficient function above, you don't have that mechanism so you instead just remove the last point when slicing the `series`. If you don't, then the function will generate a prediction at `t=1461` which is outside the validation set range.*"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "m23ny5uh8KTt"
   },
   "outputs": [],
   "source": [
    "# Reduce the original series\n",
    "forecast_series = series[split_time - window_size:-1]\n",
    "\n",
    "# Use helper function to generate predictions\n",
    "forecast = model_forecast(model, forecast_series, window_size, batch_size)\n",
    "\n",
    "# Drop single dimensional axis\n",
    "results = forecast.squeeze()\n",
    "\n",
    "# Plot the results\n",
    "plot_series(time_valid, (x_valid, results))"
   ]
  },
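  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a quick check that the slicing above behaves as intended, you can confirm that the model produced exactly one prediction per validation point."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Sanity check: there should be one prediction per validation point\n",
    "print(f'number of predictions: {len(results)}')\n",
    "print(f'validation set size: {len(x_valid)}')"
   ]
  },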
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "7cVUoLH7k5NG"
   },
   "source": [
    "You can then compute the MSE and MAE. You can compare the results here when using other RNN architectures which you'll do in the next lab."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "cxK8haPCL48G"
   },
   "outputs": [],
   "source": [
    "# Compute the MSE and MAE\n",
    "print(tf.keras.metrics.mse(x_valid, results).numpy())\n",
    "print(tf.keras.metrics.mae(x_valid, results).numpy())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Wrap Up\n",
    "\n",
    "In the next lab, you will explore a similar architecture but using LSTMs. Before doing so, run the cell below to free up resources. You might see a pop-up about restarting the kernel afterwards. You can safely ignore it and just press Ok. You can then close this lab, then go back to the classroom for the next lecture. See you there!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Shutdown the kernel to free up resources. \n",
    "# Note: You can expect a pop-up when you run this cell. You can safely ignore that and just press `Ok`.\n",
    "\n",
    "from IPython import get_ipython\n",
    "\n",
    "k = get_ipython().kernel\n",
    "\n",
    "k.do_shutdown(restart=False)"
   ]
  }
 ],
 "metadata": {
  "colab": {
   "collapsed_sections": [],
   "name": "C4_W3_Lab_1_RNN.ipynb",
   "private_outputs": true,
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.0rc1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
